ESTIMATION  OF  THE  DETECTION  PERFORMANCE 

OF  A DISPLAY 


Signal detection performance is commonly described by "Receiver
Operating Characteristic" (ROC) curves plotting the probability of signal
detection (PD) versus the probability of false alarm (PFA) over a range of
detection thresholds.* This report describes a method for determining ROC
curves for visual displays and presents results obtained from display
evaluations.


The visible displays discussed here will represent functions derived
from a meaningless random noise waveform that has a particular controlled
waveform added whenever a "signal plus noise" message is to be displayed.
The addition of signal will be detected by the change in visible patterns from
a meaningless noise background to a significantly ordered pattern.

The signal-to-noise ratio is the ratio of the power in the signal
waveform to the power in the noise waveform with which it is combined to
create a signal message.


Method  for  Obtaining  ROC  Curves  for  Visual  Displays 


The method is based on the following assumptions:


1. The observer can set a "threshold" which he can use to decide
whether a "signal" is present. This threshold will determine
the PD and PFA.


2.  The  observer  can  maintain  a threshold  through  a sequence  of 
observations. 


3.  The  observer  can  change  his  threshold  or  equivalently  can 
describe  his  degree  of  confidence  that  a signal  is  present. 

Under these assumptions, a "rating scale" method will provide
several points on an ROC curve from a single series of observations. The
ratings are the observer's relative degree of confidence that a signal created
the pattern observed. For example, an integer scale 0 through 10 may be
used. The observer would rate an observation as 0 if he saw no evidence at
all of signal. A rating of 5 would indicate the observer's feeling that signal
and no signal were equally probable. A 10 rating would signify unqualified
confidence that a signal exists. Degrees of confidence between these ratings
would be recorded as intervening integers.
*See Bibliography


The rationale for using a rating scale to obtain points on an ROC
curve can be explained in terms of probability density distributions.

Assume that a detection function has one probability density distribution for
noise and a different distribution for signal plus noise (S + N), as shown in
Figure 1. Now assume that a detection function threshold is established as
shown. The area under the portion of the S + N probability density
distribution curve to the right of the threshold is the PD. Correspondingly,
the PFA is the area under the portion of the noise only curve to the right of
the threshold. By selecting a series of threshold points one can determine
the PD and PFA curves as a function of threshold settings.
Figure 1. Signal detection relations.
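
The tail-area relation described above can be sketched numerically. The
following is only an illustration: it assumes both densities are normal with
unit standard deviation, and the means (0 for noise, 1.5 for signal plus
noise) and the function names are hypothetical choices, not values taken from
Figure 1.

```python
import math


def tail_area(threshold, mean, sigma=1.0):
    """Area of a normal density to the right of `threshold`."""
    return 0.5 * math.erfc((threshold - mean) / (sigma * math.sqrt(2.0)))


def roc_points(thresholds, noise_mean=0.0, signal_mean=1.5):
    """Return one (PFA, PD) pair per threshold setting.

    The normal shape and both means are assumptions made for illustration.
    """
    return [(tail_area(t, noise_mean), tail_area(t, signal_mean))
            for t in thresholds]


if __name__ == "__main__":
    # Lowering the threshold raises both PFA and PD.
    for pfa, pd in roc_points([2.0, 1.5, 1.0, 0.5, 0.0]):
        print(f"PFA = {pfa:.3f}   PD = {pd:.3f}")
```

Each threshold gives one point; sweeping the threshold traces out the ROC
curve for that pair of distributions.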




In  the  rating  scale  test  of  detection  performance  the  observer 
specifies  a series  of  relative  thresholds.  For  example,  consider  a test  in 
which  the  observer  rates  observations  of  308  noise  cases  and  96  signal  plus 
noise  cases  presented  in  random  sequence.  Assume  that  he  has  classified 
the  S+N  and  noise  only  cases  as  follows: 


                 Signal + Noise                    Noise
Rating    No. of Obser.   Sum    %       No. of Obser.   Sum    %

  10            5           5    5.2           1           1    0.3
   9           13          18   19             2           3    1
   8            6          24   25             1           4    1
   7            9          33   34             9          13    4
   6           15          48   50            20          33   11
   5           12          60   62            22          55   18
   4            1          61   63            13          68   22
   3            5          66   69            18          86   28
   2            8          74   77            17         103   33
   1            7          81   84            57         160   52
   0           15          96  100           148         308  100


At the 10 level he has correctly identified 5 cases of signal + noise
and has made one mistake by identifying noise as signal plus noise. Thus,
one point on the ROC is 5.2% PD vs 0.3% PFA. At the 9 level the observer
identified 13 true signals and made 2 false calls. What the observer means
is that 9 is the highest threshold level at which he would call these cases
signal + noise. Thus, the total number of true signal + noise cases the
observer would call signal + noise for a threshold set at 9 would be the
number at threshold level 9 (13) plus the number at level 10 (5), for a
sum of 18 for signal and a sum of 3 for false calls. These cumulative sums
provide a second ROC point of 19% PD vs 1.0% PFA. Continuing this summing
process through successively lower threshold levels, we generate the ROC
curve plotted in Figure 2. This ROC curve represents performance for a
single signal-to-noise ratio. Different signal-to-noise ratios generate
different ROC curves, as outlined in Figure 3.
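
The summing process can be checked with a short sketch. The counts below are
the "No. of Obser." columns of the example table, listed from rating 10 down
to rating 0; the function names are illustrative only.

```python
# Counts per rating level, from rating 10 down to rating 0
# (the "No. of Obser." columns of the example table).
SIGNAL_COUNTS = [5, 13, 6, 9, 15, 12, 1, 5, 8, 7, 15]      # 96 S+N cases
NOISE_COUNTS = [1, 2, 1, 9, 20, 22, 13, 18, 17, 57, 148]   # 308 noise cases


def cumulative_percent(counts):
    """Running sum from the highest rating down, as a percent of the total."""
    total = sum(counts)
    running = 0
    percents = []
    for count in counts:
        running += count
        percents.append(100.0 * running / total)
    return percents


def roc_from_ratings(signal_counts, noise_counts):
    """Return (PFA %, PD %) pairs, one per threshold level."""
    pd = cumulative_percent(signal_counts)
    pfa = cumulative_percent(noise_counts)
    return list(zip(pfa, pd))


if __name__ == "__main__":
    points = roc_from_ratings(SIGNAL_COUNTS, NOISE_COUNTS)
    for rating, (pfa, pd) in zip(range(10, -1, -1), points):
        print(f"threshold {rating:2d}: PFA = {pfa:5.1f}%  PD = {pd:5.1f}%")
```

Each threshold level contributes one point, so a single series of ratings
yields eleven points, ending at 100% PD and 100% PFA for a threshold of 0.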

In typical studies of detection performance a separate series of
test observations is made for each signal-to-noise ratio; that is, each test
series produces one ROC curve. In theory, a single series of test
observations could provide more than one ROC curve if two or more
signal-to-noise ratios were represented in the signal + noise test cases
observed. In the experiments reported here, we mixed examples of 3
signal-to-noise ratios with noise only cases in random order. Thus, we
generated 3 ROC curves from each evaluation test. These ROC curves seemed
to bear reasonable relations to each other in light of the signal-to-noise
ratios represented. Evaluating observer display detection performance for
mixed signal-to-noise ratios corresponds to real world situations in which
signal-to-noise ratios are variable and unpredictable.

Detection Performance for Intensity Modulated Spectrograms

Time history displays of signal spectra representing amplitude as gray
scale shadings (Kay Sonagraph, etc.) are widely used in signal processing
studies. The spectrograms evaluated here were made by passing signals
through an analog spectrum analyzer and recording the spectra on 35 mm
strip film by photographing a CRT display. The amplitude function
recorded is the magnitude of the spectrum as a function of frequency.
Continuous film motion produces a time history record of successive spectra.

For display evaluation, a random noise generator supplied the "noise
only" conditions. A sine wave of constant frequency and known power was
added to the random noise for signal plus noise cases. Three signal-to-noise
ratios were selected to span the general region of 10% to 90% PD for a PFA
of about 25%. A total of 600 cases were prepared: approximately 300 noise
only and about 100 each at the three signal-to-noise ratios. Observer
endurance and the cost of testing limited both the number of sample
observations and the number of observers.

Each test sample was a piece of 35 mm film. Figures 4a, 4b, 4c and 4d
reproduce samples that were shown to the observer for training prior to
testing. Each film section represents a time bandwidth product of about 70
over a frequency span of about 40 spectral lines (resolution elements).
If present, the sine wave signal would appear at the center of the frequency
range (vertical). Thus, the general background was always random noise.

The actual test samples were mounted individually on cards placed in a rotary
file. The rotary file presents only one card in view at a time, preventing
comparisons of different test samples. Each card was numbered for use by
the observer in rating and for use by the scorer in calculating ROC curves.

The 600 samples were divided into 5 groups of 120 samples each. The
observer does only one group at a time to avoid fatigue. Our experience shows
that an experienced observer can estimate probabilities for 120 test cases
in about 15 minutes.
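
The composition of such a test set can be sketched as follows. This is an
illustration under stated assumptions: the report gives only approximate
counts, so exactly 300 noise cases and 100 cases per signal-to-noise ratio
are assumed here, and the condition labels, the function name and the use of
a seeded shuffle are hypothetical.

```python
import random


def build_test_deck(n_noise=300, n_per_snr=100,
                    snr_labels=("snr_1", "snr_2", "snr_3"),
                    group_size=120, seed=0):
    """Return (master sheet, groups of card numbers).

    Counts and labels are assumed values for illustration only.
    """
    cases = ["noise"] * n_noise
    for label in snr_labels:
        cases += [label] * n_per_snr

    # Random presentation order, so the observer cannot anticipate a case.
    rng = random.Random(seed)
    rng.shuffle(cases)

    # Master code sheet: card number (starting at 1) -> true condition.
    master = {card: cond for card, cond in enumerate(cases, start=1)}

    # Split the numbered cards into sittings of `group_size` cards each.
    cards = sorted(master)
    groups = [cards[i:i + group_size]
              for i in range(0, len(cards), group_size)]
    return master, groups


if __name__ == "__main__":
    master, groups = build_test_deck()
    print(len(master), "cards in", len(groups), "groups")
```

The master sheet plays the role of the scorer's code sheet: the observer sees
only card numbers, and the true condition is looked up afterwards.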

The  following  rating  scale  instructions  were  given  to  each  observer 
on  a sheet  which  he  kept  with  him  during  the  tests. 

Meaning 

Absolutely  sure  target  is  present 


50% probability target is present


Hint  of  target 

See  no  indication  target  is  present 

The observer was instructed orally to use all the rating levels as
suitable. Test case samples (Figure 4) were shown to the observer before
a test. The observer could examine these samples for as long as he chose,
but the training sheet was removed before the test began. All of the observers
for this particular display evaluation were already acquainted with the
interpretation of the intensity modulated spectrogram.

Computing ROC curves from the observers' ratings was done manually.
The observers wrote their probability ratings in spaces numbered the same
as the test cards. Master code sheets enabled the scorer to assign ratings to
the proper noise or noise plus signal categories. The scorer then totaled the
number of ratings at each level to fill in summary sheets such as the one in
Figure 5. For each condition (noise only and the three signal-to-noise
ratios), the number of cases rated at each level of probability (0 through
10), the cumulative sums going from the highest level to the lowest, and the
percentage of the total number of cases in a category are calculated. The
final PFA and PD values are collected in a block on the lower right part of
the figure. It should be noted that for each PFA there is a corresponding PD
at each signal-to-noise ratio. However, for PFA = 0 (at the 10 level) no PD
was computed and plotted.
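
The manual scoring steps could be mechanized along the following lines. This
is a sketch, not the procedure actually used: the data layout (a mapping of
card number to rating and a master sheet mapping card number to condition)
and all names are assumptions made for illustration.

```python
from collections import defaultdict


def summarize(ratings, master, levels=range(10, -1, -1)):
    """Tally ratings by condition, as on a scorer's summary sheet.

    ratings: card number -> rating (0 through 10)
    master:  card number -> condition label (hypothetical labels)
    Returns condition -> rows of (level, count, cumulative sum, percent),
    ordered from the highest rating level to the lowest.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for card, rating in ratings.items():
        counts[master[card]][rating] += 1

    summary = {}
    for condition, by_level in counts.items():
        total = sum(by_level.values())
        running = 0
        rows = []
        for level in levels:
            n = by_level.get(level, 0)
            running += n
            rows.append((level, n, running, 100.0 * running / total))
        summary[condition] = rows
    return summary


if __name__ == "__main__":
    # Tiny made-up example: four cards, two conditions.
    ratings = {1: 10, 2: 0, 3: 5, 4: 0}
    master = {1: "signal", 2: "noise", 3: "signal", 4: "noise"}
    for condition, rows in summarize(ratings, master).items():
        print(condition, rows[0], rows[-1])
```

The percent column for the noise condition gives the PFA values, and the
percent column for each signal-to-noise condition gives the corresponding PD
values at the same rating level.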


Figure 5. Summary sheet for plot of observer's ROC.


The calculated ROC curves for four observers are reproduced in
Figure 6. There are three sets of ROC plots representing three different
signal-to-noise conditions. There is good agreement among observers'
estimates of the ROC curve, as shown by the clustering of plots for a
particular signal-to-noise ratio and the clear separation between clusters of
plots for the signal-to-noise ratios, which differed in 3 dB steps. One
observer (HS) applied a restricted range of thresholds relative to the other
three observers, but his "short" ROC curves coincide with corresponding parts
of the other plots.

Figure 6. ROC curves for four observers (axis label: probability of a
false alarm in %).


A comparison of these ROC curves with theoretical detection
performance* is shown in Figure 7. Theoretical curves are shown for
PFA = 0.01, 0.1, and 0.3. Corresponding "mean" values are shown for
the spectrogram ROC curves. The time-bandwidth products (wt = 60 vs 70)
are slightly different, but the error is negligible. The detection performance
for the visual display seems to lie about 3 dB below theoretical, a seemingly
reasonable difference in the light of known imperfections in the spectrum
analysis process used.

Figure 7.


*See Appendix A of this report for calculation of Square Law Detector
Performance

Conclusions


The rating scale method for generating ROC curves of detection
performance seems to have worked well here. Mixing cases with different
signal-to-noise ratios generated ROC plots with detection performance that
varied in the expected manner, as shown by comparison to theoretical ROC
curves. Since all of the observers examined the same test cases, the spread
between ROC curves is caused by differences between observers. Oddly enough,
the most experienced observer (DW) had the lowest PD values. Certainly tests
with more than four observers would be required to predict performance
variation among observers.

SQUARE LAW DETECTOR PERFORMANCE - by R. R. Rustay

This  Appendix  briefly  describes  the  computational  procedure  used 
to  obtain  the  curves  on  Figure  7.  The  computational  procedure  is  based 
directly  on  the  report 

A Statistical  Theory  of  Target  Detection  by  Pulsed  Radar: 
Mathematical  Appendix 

J.I.  Marcum,  The  Rand  Corporation  Report  RM-753,  1 July  1948 
Reissued  25  April  1952 

The "square law" model being considered is shown in the following sketch*


A sin(ω₀t + φ)

where n(t) is narrow band Gaussian noise centered about ω₀ and φ is a
random uniformly distributed (0, 2π) constant. The probability density function
(pdf) Q_S(S|A) associated with sum S

*The constant B was included for purpose of normalizing if desired.


The mean and variance of S are

The probability of false alarm PFA is, given the threshold Y

and corresponding probability of detection PD

PFA and PD can be expressed in terms of defined functions, i.e.,
PFA = 1 - I(u, N-1) = Q(χ²|ν)


PD 

where


I(u, p) = Pearson's Incomplete Gamma Function

Q(χ²|ν) = Complementary Chi-Square Probability Function

T_Y(m, n, r) = Incomplete Toronto Function

However, because tables of these functions are either not readily
available or not convenient, the integral


∫_Y^∞ Q_S(S|A) dS

has been computed digitally using an Edgeworth form of a Gram-Charlier
Series, i.e.,

Computationally, a PFA is selected, Y₀ found by iteration, and then
PD is computed for various signal-to-noise power ratios X. Notice that the
first term in the Gram-Charlier Series is the normal approximation.
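
The computational sequence (select a PFA, find the threshold by iteration,
then compute PD over a range of signal-to-noise ratios) can be sketched as
below. This is not the program used for Figure 7. It assumes each square-law
sample is normalized to unit mean under noise alone, so that the noise-only
sum of N samples has an Erlang (gamma) distribution; it assumes x is the
per-sample signal-to-noise power ratio, giving a sum with mean N(1 + x) and
variance N(1 + 2x); and it keeps only the first (normal) term of the series,
omitting the Edgeworth correction terms. The value N = 60 and all function
names are illustrative.

```python
import math


def pfa_exact(y, n):
    """Noise-only tail: P(sum of n unit-mean exponential samples > y)."""
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= y / k          # y**k / k!
        total += term
    return math.exp(-y) * total


def threshold_for_pfa(pfa, n, tol=1e-10):
    """Find the threshold by bisection; pfa_exact decreases as y grows."""
    lo, hi = 0.0, float(n)
    while pfa_exact(hi, n) > pfa:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if pfa_exact(mid, n) > pfa:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pd_normal(y, n, x):
    """Normal approximation to PD (first series term only, an assumption)."""
    mean = n * (1.0 + x)
    var = n * (1.0 + 2.0 * x)
    return 0.5 * math.erfc((y - mean) / math.sqrt(2.0 * var))


if __name__ == "__main__":
    n = 60                                  # illustrative number of samples
    y0 = threshold_for_pfa(0.01, n)
    for snr_db in (-10.0, -7.0, -4.0):      # illustrative per-sample SNRs
        x = 10.0 ** (snr_db / 10.0)
        print(f"SNR {snr_db:6.1f} dB   PD ~ {pd_normal(y0, n, x):.3f}")
```

Because only the normal term is kept, the PD values are rougher than those of
the full series, particularly for small N or in the tails of the distribution.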

THIS [illegible] PROGRAM IS BASED ON MARCUM'S ANALYSIS
(SUPPLEMENTED BY [illegible] ANALYSIS) AND INPUTS N INDEPENDENT
SAMPLES FROM A SQUARE LAW DETECTION OF ADDITIVE NARROWBAND
GAUSSIAN NOISE AND A CW SIGNAL. ALSO INPUTTED IS THE
COMPUTES THE PROBABILITY OF DETECTION FOR VARIOUS

SIGNAL TO NOISE POWER RATIOS (VARIABLE DB IN PROGRAM). THE RANGE
OF S/N IS CONTROLLED BY [illegible] PARAMETERS. DB IS
RELATED TO [illegible] USED BY MARCUM AND IS GREATER THAN [illegible] BY
[illegible] (2N).

THE FUNCTION [illegible] CONTAINS THE GRAM CHARLIER SERIES
APPROXIMATION, THE FIRST TERM OF WHICH WOULD BE THE NORMAL
DISTRIBUTION APPROXIMATION.